According to quantum theory, the outcomes of future measurements cannot (in general) be predicted
with certainty. In some cases, even with a complete physical description of the system to be
measured and the measurement apparatus, the outcomes of certain measurements are completely
random. This raises the question, originating in the paper by Einstein, Podolsky and Rosen [1],
of whether quantum mechanics is the optimal way to predict measurement outcomes. Established
arguments and experimental tests exclude a few specific alternative models [2-15]. Here, we
provide a complete answer to the above question, refuting any alternative theory with significantly
more predictive power than quantum theory. More precisely, we perform various measurements on
distant entangled photons, and, under the assumption that these measurements are chosen freely,
we give an upper bound on how well any alternative theory could predict their outcomes [16]. In
particular, in the case where quantum mechanics predicts two equally likely outcomes, our results
are incompatible with any theory in which the probability of a prediction is increased by more than
~0.19. Hence, we can immediately refute any already considered or yet-to-be-proposed alternative
model with more predictive power than this.



Many of the predictions we make in everyday life are 
probabilistic. Usually this is caused by having incom- 
plete information, as is the case when making weather 
forecasts. On the other hand, even with all the informa- 
tion available within quantum mechanics, the outcome of 
certain experiments, e.g., the path taken by a spin-half 
particle in a Stern-Gerlach experiment, is generally not 
predictable before the start of the experiment. This lack 
of predictive power has prompted a long debate, which in 
turn led to important fundamental insights. In particu- 
lar, Kochen and Specker, and independently Bell, proved 
that there cannot exist any noncontextual theory that 
predicts observations with certainty [17, 18]. In a similar
vein, Bell showed [5] that in general there cannot exist
any additional hidden property of the particle (a local
hidden variable) that completely determines the outcome
of any measurement on the particle (for an illustration of
such a model see Fig. 1). Bell's argument relies on the
fact that entangled particles give rise to correlations that 
cannot be reproduced in any local hidden variable theory. 
The existence of such correlations has been confirmed in a 
series of increasingly sophisticated experiments [IHSl [11] . 

The purpose of the above arguments was to refute the- 
ories in which hidden parameters determine any exper- 
imental outcomes. Access to these parameters would 
allow us, in principle, to predict the outcomes of any 
experiment with certainty. However, these arguments 
do not preclude the possibility that we find a theory 
that has more predictive power than quantum mechan- 
ics, while remaining probabilistic. Consider again the 
Stern-Gerlach example where, according to quantum me- 
chanics, a particle entering the apparatus with a certain 
spin orientation may be deviated in one of two direc- 
tions, each with probability 0.5. One may now conceive 
of a theory that, depending on an additional parameter,
would allow us to predict the direction of deviation
with a larger probability, say 0.75, thereby improving
the quantum mechanical prediction by 0.25. In Fig. 1 we
describe an example of such a theory, which essentially
corresponds to a proposal put forward by Leggett [4].

In this Letter we present experimental data that
bounds the probability, $\delta$, by which any alternative theory
could improve upon predictions made by quantum
theory while still being consistent with the assumption
that measurement settings can be chosen freely. We find
that quantum theory is close to optimal in terms of its
predictive power. Our work relies on a recent theoretical
argument [16], which is itself based on a sequence of
work [19-23], partly in the area of quantum cryptography.
The experiment requires measuring bipartite correlations
of entangled particles, as well as establishing the
distributions of the associated individual measurement
outcomes, for a sufficiently large number of measurement
settings. The maximum increase of predictive power, $\delta$,
of any alternative theory then depends on the strength
of the measured correlations, $I_N$, and on the bias in the
individual outcomes, $\nu_N$:

$$\delta \le \frac{I_N}{2} + \nu_N . \tag{1}$$

(These quantities are defined in the Appendix, where we
also explain the procedure for obtaining $I_N$ and $\nu_N$ from
experimental data.)

Before describing the experimental setup, let us briefly
review the main features of the theory leading to Eq. (1) (a
complete derivation is given in the Appendix). Crucially,
the framework used is operational, in the sense that it 
refers only to directly observable quantities, such as mea- 
surement outcomes. For example, the Stern-Gerlach ex- 
periment mentioned above outputs a binary value, X, 
FIG. 1: Two alternative models. Consider an experiment
in which a source emits two spin-half particles travelling to
two distant sites. Their spin direction is measured, e.g., by
letting them pass through a filter that absorbs particles with
the opposite spin direction. If the particles are initially
maximally entangled, then the probability of correctly predicting
whether the particle on the left is transmitted by the filter,
which has a direction $\mathbf{a}$, is, according to quantum mechanics,
given by $p_{QM} = 0.5$. a, Bell's model of hidden variables
[18] as an example for an alternative deterministic theory.
Bell proposed a model in which the outcome of such a
measurement is precisely determined by the particle's quantum
mechanical state vector $|\psi\rangle$, the measurement specified by $\mathbf{a}$,
and an additional real number $\lambda$, a local hidden variable
(LHV) that is not present in standard quantum mechanics.
If we had access to $\lambda$, we could predict the outcome of the
measurement with certainty, hence $p_{LHV} = 1$. b, A Leggett-type
model [4] as an example for an alternative probabilistic
theory. Leggett imagined a theory in which each particle
carries a hidden parameter, specified as a vector $\mathbf{z}$ that may be
seen as a "classical spin". His model, adapted to general spin
particles, prescribes that the probability that a particle with
vector $\mathbf{z}$ is transmitted by a filter in direction $\mathbf{a}$ is given by
$\tfrac{1}{2} + \tfrac{1}{2}\,\mathbf{z}\cdot\mathbf{a}$. Since the vector $\mathbf{z}$ is unknown, we model it here
as a random variable with no preferred direction (a detailed
discussion of this and other distributions over $\mathbf{z}$ is deferred
to the Appendix). A straightforward calculation then shows
that, if we had access to the parameter $\mathbf{z}$, we could on average
indicating in which direction the particle deviated. We
associate with X a time coordinate t and three spatial
coordinates $(x_1, x_2, x_3)$, corresponding to a point in
spacetime where the value X can be observed. (If the value
can be observed at different points in space time, we may 
define further copies of X with different spacetime co- 
ordinates.) We note that these coordinates can be de- 
termined operationally (e.g., using clocks and measuring 
rods, with respect to a fixed reference system). We call 
such observable values with spacetime coordinates space- 
time variables (SVs). In the same manner, any param- 
eter that is needed to specify the experiment (e.g., the 
orientation of the Stern-Gerlach apparatus) can be mod- 
elled as an SV. 

Consider now an experiment in which a spin measure- 
ment is made on a particle that is maximally entangled 
with another one. According to quantum theory, the 
outcome, X, of this measurement is random, even with a 
complete description of the measurement apparatus, A. 
However, an alternative theory may provide us with
additional information, $\Xi$ (which can also be modelled in
terms of SVs [16]). We can then ask whether this additional
information $\Xi$ can be used to improve the predictions
that quantum mechanics makes about X, which depend
on the measurement setting A and the initial state
(which we assume to be fixed). This question has a negative
answer if the distribution of X, conditioned on A,
is unchanged when we learn $\Xi$. This can be expressed in
terms of the Markov chain condition [25],

$$X \leftrightarrow A \leftrightarrow \Xi .$$

Equation (1) now places a bound on the maximum probability,
$\delta$, by which this condition is violated. In other
words, the predictions obtained from quantum theory are
optimal except with probability (at most) $\delta$.

For the specific measurement described above, the validity
of Eq. (1) relies only on the natural (and often implicit)
assumption that measurement parameters can be
chosen freely. This assumption can be expressed in the
above framework as the requirement that the SV corresponding
to a measurement parameter, A, can be chosen
such that it is statistically independent of all SVs whose
coordinates lie outside the future lightcone of A (Bell's
theorem also relies on such an assumption, for example,
as explained in Ref. [26]). When interpreted within the
usual relativistic spacetime structure, this is equivalent to 
demanding that A is uncorrelated with any pre-existing 
values in any frame. We also note that this requirement 
can be seen as a prerequisite for non-contextuality, as 
pointed out in Ref. (where an alternative proof that 
quantum theory cannot be extended, based on the as- 
sumption of non-contextuality, is offered). 

We note that our bound on the predictive power of al- 
ternative theories can be extended to arbitrary measure- 
ments (not necessarily on maximally entangled particles) 
if one makes one additional assumption. This assumption 
is that the evolution of the state of a physical system can 
always be correctly described by a unitary operation if 
one includes part of the environment in the description 
of the process [16].

The experimental setup we use to bound the quantities
on the right-hand side of Eq. (1) is detailed in Fig. 2. Our
source [27] generates photon pairs with high fidelity to
the entangled state

$$|\phi^+\rangle = \frac{1}{\sqrt{2}}\left(|HH\rangle + |VV\rangle\right), \tag{2}$$

where $|H\rangle$ and $|V\rangle$ represent horizontal and vertical
polarization states, respectively, and replace the usual spin-up
and spin-down notation for spin-half particles. The
photons from each pair are separated and sent towards
polarization analyzers that can be adjusted to measure
the polarization of an incoming photon along any desired
direction $\mathbf{S} = (S_+, S_L, S_H)$, where the three-component
vector $\mathbf{S}$ is expressed in terms of its projections onto
diagonal (+45°), left-circular (L), and horizontal (H)
polarized components. $\mathbf{S}$ is conveniently depicted on the
Bloch sphere, see Fig. 3.

FIG. 2: Generating and measuring entangled states. a, Experimental setup. A diagonally polarized, continuous wave,
532 nm wavelength laser beam is split by a polarizing beam splitter (PBS) and travels both clockwise and counter-clockwise
through a polarization Sagnac interferometer. The interferometer contains two type-I, periodically poled lithium niobate
(PPLN) crystals configured to produce collinear, non-degenerate, 810/1550 nm wavelength photon pairs by means of
spontaneous parametric down-conversion. As photon-pair generation is polarization dependent, the clockwise-travelling, vertically
polarized (counter-clockwise travelling, horizontally polarized) pump light passes through the first crystal without interaction
and may down-convert in the second crystal to produce two horizontally (vertically) polarized photons. For sufficiently small
pump power, recombination of the two bi-photon modes on the PBS yields the entangled $|\phi^+\rangle$ state. After exiting the
interferometer, the remaining pump light is filtered out using a high-pass filter. The entangled photons are separated on a dichroic
mirror and sent to analyzers that allow one to measure polarization along any desired direction on the Bloch sphere. They
consist of quarter wave plates (QWP), half wave plates (HWP), PBSs and single photon detectors. The 810 nm photons are
detected using a free-running silicon avalanche photodiode (Si APD), and 1550 nm photons are detected using an InGaAs
APD triggered by detection events from the Si APD. b, Density matrix. Density matrix $\rho_{real}$ of the bi-photon state produced
by our source as calculated via maximum-likelihood quantum state tomography [28] (see the Appendix for actual values). The
fidelity, $F = \langle\phi^+|\rho_{real}|\phi^+\rangle$, between the detected state, $\rho_{real}$, and the ideal state, $|\phi^+\rangle$, given by Eq. (2), is (98.1 ± 0.1)%.

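We perform a series of experiments that are parameterized
by an integer, N. Each experiment yields a value,
$\delta_N = I_N/2 + \nu_N$, and we bound $\delta$ by the minimum over
all measured $\delta_N$. Each experiment comprises N pairs of
opposing spin measurement settings per analyzer (i.e., N
different bases):

$$\mathbf{S}_A(m) = \begin{pmatrix} \cos\left(\frac{m\pi}{2N}\right) \\ \sin\left(\frac{m\pi}{2N}\right) \\ 0 \end{pmatrix}, \quad m \in \{0, 2, 4, \ldots, (4N-2)\}, \tag{3}$$

and

$$\mathbf{S}_B(n) = \begin{pmatrix} \cos\left(\frac{n\pi}{2N}\right) \\ -\sin\left(\frac{n\pi}{2N}\right) \\ 0 \end{pmatrix}, \quad n \in \{1, 3, 5, \ldots, (4N-1)\}. \tag{4}$$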
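As an illustration of how the settings are indexed, the following minimal Python sketch enumerates the Bloch vectors for a given N. It assumes that both sets of settings lie in the diagonal/left-circular plane with angle $k\pi/(2N)$ for index k, and that the sign of the L component is flipped for the second analyzer; the function names `setting_vector` and `settings` are hypothetical and are not part of the experiment's software.

```python
import math

def setting_vector(index, n_bases, analyzer):
    """Bloch vector (S_+, S_L, S_H) for one setting.

    Assumption of this sketch: the angle is index * pi / (2N) in the
    diagonal / left-circular plane, the H component is zero, and the
    L component changes sign for analyzer "B".
    """
    angle = index * math.pi / (2 * n_bases)
    sign = 1.0 if analyzer == "A" else -1.0
    return (math.cos(angle), sign * math.sin(angle), 0.0)

def settings(n_bases):
    """Even indices 0..4N-2 for analyzer A, odd indices 1..4N-1 for B."""
    a = {m: setting_vector(m, n_bases, "A") for m in range(0, 4 * n_bases, 2)}
    b = {n: setting_vector(n, n_bases, "B") for n in range(1, 4 * n_bases, 2)}
    return a, b

a_settings, b_settings = settings(4)
# Indices m and m + 2N label opposing directions, i.e. one basis.
print(len(a_settings), len(b_settings))
print(a_settings[0], a_settings[8])
```

With this indexing, each analyzer has 2N settings forming N bases, and the setting opposite to index k is index k + 2N.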

For each setting $\mathbf{S}_A(m)$, we count detected photons over
80 sec to establish the bias $\nu_N$. Furthermore, for certain
joint measurements (described by specific combinations
 N    $I_N$              $\nu_N$            $\delta_N$
 2    0.6196 ± 0.0049    0.0027 ± 0.0003    0.3125 ± 0.0025
 3    0.4802 ± 0.0046    0.0036 ± 0.0003    0.2437 ± 0.0023
 4    0.4103 ± 0.0046    0.0043 ± 0.0003    0.2094 ± 0.0023
 5    0.3940 ± 0.0045    0.0045 ± 0.0003    0.2015 ± 0.0023
 6    0.3791 ± 0.0041    0.0047 ± 0.0003    0.1942 ± 0.0021
 7    0.3872 ± 0.0042    0.0048 ± 0.0003    0.1984 ± 0.0021

TABLE I: Summary of Results. The table shows values
for $I_N$, bias $\nu_N$, as well as $\delta_N = I_N/2 + \nu_N$. Statistical
uncertainties (one standard deviation) are calculated from
measurement results assuming Poissonian statistics.



of $\mathbf{S}_A(m)$ and $\mathbf{S}_B(n)$), we also register the number of
detected photon pairs over 40 sec to calculate $I_N$, and
hence $\delta_N$ (see the Appendix).
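The arithmetic behind the bound can be checked directly from the central values reported in Table I. The sketch below is illustrative only: it ignores the statistical uncertainties, and the names `TABLE_I` and `delta_bound` are hypothetical.

```python
# Central values (I_N, nu_N) as reported in Table I; uncertainties omitted.
TABLE_I = {
    2: (0.6196, 0.0027),
    3: (0.4802, 0.0036),
    4: (0.4103, 0.0043),
    5: (0.3940, 0.0045),
    6: (0.3791, 0.0047),
    7: (0.3872, 0.0048),
}

def delta_bound(i_n, nu_n):
    """Right-hand side of Eq. (1): I_N / 2 + nu_N."""
    return i_n / 2 + nu_n

bounds = {n: delta_bound(i_n, nu_n) for n, (i_n, nu_n) in TABLE_I.items()}
best_n = min(bounds, key=bounds.get)
print(best_n, bounds[best_n])
```

The smallest value is obtained for N = 6, in agreement with the tabulated $\delta_6 \approx 0.194$.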

Our experimental results are depicted in Fig. 3 and
summarized in Table I. We measured $\delta_N$ for N = 2 to
N = 7 and found the minimum, $\delta_6$ = 0.194 ± 0.003,
for N = 6. Hence, the probability by which the predictions
made by quantum theory can be improved using
any alternative theory is at most ~0.19. As examples,
note that this value rules out local hidden variable
as well as Leggett-type models (as explained in Fig. 1
and the Appendix), since $p_{LHV} - p_{QM} = 0.5 > \delta$ and
$p_{Leggett} - p_{QM} = 0.25 > \delta$, respectively. (Here, p denotes
the maximum probability of correctly predicting
the measurement outcome in the model/theory indicated
in the subscript.) We remark that our experiments do not
close the locality and detection loopholes, so, strictly, the
above conclusions hold modulo the assumption that similar
experiments closing these loopholes would show the
same results.

FIG. 3: Measurements and results. a, Measurement
settings. Graphical depiction of the polarization measurements
along $\mathbf{S}_A$ (red) and $\mathbf{S}_B$ (blue). The example shows
N = 4. b, Results. Experimentally obtained values $\delta_N$ (blue
diamonds) with one-standard-deviation uncertainties calculated
from measurement results assuming Poissonian statistics.
Also shown is a curve joining the values predicted by
quantum theory, including one-standard-deviation statistical
uncertainties (solid red line and grey shaded area, respectively),
calculated from the measured density matrix $\rho_{real}$.
The bounds of the shaded region are derived using Monte
Carlo simulations and are consistent with the observed
variations of the measured values. Finally, the dashed blue line
is the theoretical curve, again calculated using quantum theory,
that assumes the ideal $|\phi^+\rangle$ state, as in Eq. (2), and perfect
experimental apparatus with zero noise. It asymptotically
approaches zero as N tends to infinity. For instance, for N = 6
we find $\delta_6$ = 0.102.

In conclusion, under the assumption that measure- 
ments can be chosen freely, no theory can predict mea- 
surement outcomes substantially better than quantum 
mechanics. In other words, any already considered or 
yet-to-be-proposed theory that makes significantly better
predictions would either be incompatible with the
experimental observations presented herein, or be incompatible
with our assumption that the measurement parameters
can be chosen freely. While the former is true,
for example, for local hidden variable theories (as already
pointed out by Bell [2]) or for the Leggett model [1], the
de Broglie-Bohm theory [30] is an example of the second
type: the theory cannot incorporate measurement
parameters that satisfy our free choice assumption.
APPENDIX



Calculation of bias, $\nu_N$, and correlation strength, $I_N$

The bias is a measure of how close the distribution of the individual measurement outcomes, X, is to uniform. It is
calculated from the number of 810 nm wavelength photons detected in opposing spin measurements. In general, this
number is not the same for all pairs of measurements. We take the bias, $\nu_N$, to be the maximum over the individual
biases. Denoting the number of detected photons for setting $\mathbf{S}_A(m)$ by M(m), we have:

$$\nu_N = \frac{1}{2} \max_{a \in \{0, 2, \ldots, (2N-2)\}} \left( \frac{|M(a) - M(a+2N)|}{M(a) + M(a+2N)} \right). \tag{5}$$

The quantity $I_N$ measures the strength of the bipartite correlation between two analyzers. It is defined by

$$I_N = P(0, 2N-1) + \sum_{\substack{a, b \\ |a-b| = 1}} \left(1 - P(a, b)\right), \tag{6}$$

where $a \in \{0, 2, 4, \ldots, (2N-2)\}$, $b \in \{1, 3, 5, \ldots, (2N-1)\}$. Furthermore, P(a, b) is the sum of the probabilities for
detecting two photons from a pair along the spin vectors $\mathbf{S}_A(a)$ and $\mathbf{S}_B(b)$, and along $-\mathbf{S}_A(a)$ and $-\mathbf{S}_B(b)$ (i.e. the
probability of correlated outcomes):

$$P(a, b) = \frac{M(a, b) + M(a+2N, b+2N)}{M(a, b) + M(a, b+2N) + M(a+2N, b) + M(a+2N, b+2N)}, \tag{7}$$

where, e.g., M(a, b) is the number of joint photon detections for measurements along $\mathbf{S}_A(a)$ and $\mathbf{S}_B(b)$.
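The procedure above can be sketched in a few lines of Python. The sketch assumes that the single counts are stored in a dictionary keyed by the setting index and the joint counts in a dictionary keyed by index pairs; the function names are hypothetical. The counts used in the example are synthetic expected values for a noise-free maximally entangled state (with one artificially unbalanced pair of single counts), not experimental data.

```python
import math

def bias(singles, n_bases):
    """nu_N from single counts M(a): half the largest relative imbalance
    between opposing settings a and a + 2N (cf. Eq. (5))."""
    two_n = 2 * n_bases
    return 0.5 * max(
        abs(singles[a] - singles[a + two_n]) / (singles[a] + singles[a + two_n])
        for a in range(0, two_n, 2)
    )

def p_correlated(joint, a, b, n_bases):
    """Probability of correlated outcomes for settings (a, b) (cf. Eq. (7))."""
    two_n = 2 * n_bases
    same = joint[(a, b)] + joint[(a + two_n, b + two_n)]
    diff = joint[(a, b + two_n)] + joint[(a + two_n, b)]
    return same / (same + diff)

def correlation_strength(joint, n_bases):
    """I_N: correlated term for (0, 2N-1) plus anticorrelated terms for
    all neighbouring pairs with |a - b| = 1 (cf. Eq. (6))."""
    two_n = 2 * n_bases
    total = p_correlated(joint, 0, two_n - 1, n_bases)
    for a in range(0, two_n, 2):
        for b in (a - 1, a + 1):
            if 1 <= b <= two_n - 1:
                total += 1.0 - p_correlated(joint, a, b, n_bases)
    return total

# Synthetic, noise-free expected counts (an assumption, not measured data).
N = 6
singles = {a: 1000.0 for a in range(0, 4 * N, 2)}
singles[0], singles[2 * N] = 1010.0, 990.0  # one slightly unbalanced basis
joint = {
    (a, b): 1000.0 * (1 + math.cos((a - b) * math.pi / (2 * N))) / 4
    for a in range(0, 4 * N, 2)
    for b in range(1, 4 * N, 2)
}

nu_n = bias(singles, N)
i_n = correlation_strength(joint, N)
delta_n = i_n / 2 + nu_n
print(nu_n, i_n, delta_n)
```

For these idealized counts the correlation strength reduces to $N(1 - \cos\frac{\pi}{2N})$, which is the noise-free quantum value.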



Proof of the bound 

In this section, we prove the bound given in Equation (1) in the main text, which is stated as Lemma 1 below.
We use a bipartite scenario in which two spacelike separated measurements are performed on a maximally entangled
state. We denote the choices of observable $A \in \{0, 2, \ldots, 2N-2\}$ and $B \in \{1, 3, \ldots, 2N-1\}$ and their outcomes
$X \in \{+1, -1\}$ and $Y \in \{+1, -1\}$, respectively¹. We additionally consider information that might be provided by an
alternative theory (this was denoted $\Xi$ in the main text), which is modelled as an additional system with input C and
output Z [16]. If one makes the assumption that the measurements can be chosen freely, then the joint distribution
$P_{XYZ|ABC}$ satisfies the non-signalling conditions

$$P_{XY|ABC} = P_{XY|AB} \tag{8}$$
$$P_{XZ|ABC} = P_{XZ|AC} \tag{9}$$
$$P_{YZ|ABC} = P_{YZ|BC} \tag{10}$$

(see [16] for a short proof of this).

Lemma 1 gives a bound on the increase in predictive power of any alternative theory in terms of the strength of
correlations and the bias of the individual outcomes. The bound is expressed in terms of the variational distance
$D(P_Z, Q_Z) := \frac{1}{2}\sum_z |P_Z(z) - Q_Z(z)|$, which has the following operational interpretation: if two distributions have
variational distance at most $\delta$, then the probability that we ever notice a difference between them is at most $\delta$.

The bias is quantified by² $\nu_N := \max_a D(P_{X|a}, \bar{P}_X)$, where $\bar{P}_X$ is the uniform distribution on X. To quantify the
correlation strength, we define

$$I_N := P(X = Y | A = 0, B = 2N-1) + \sum_{\substack{a, b \\ |a-b| = 1}} P(X \neq Y | A = a, B = b). \tag{11}$$
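For concreteness, the variational distance and the bias can be evaluated numerically as follows. This is a minimal sketch that assumes the standard normalization with a prefactor of 1/2 and represents distributions as dictionaries mapping outcomes to probabilities; the names `variational_distance` and `bias` are hypothetical.

```python
def variational_distance(p, q):
    """D(P, Q) = (1/2) * sum_z |P(z) - Q(z)| for distributions given as dicts."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(z, 0.0) - q.get(z, 0.0)) for z in support)

def bias(conditionals):
    """Largest distance from uniform over all settings a, where
    conditionals[a] is the distribution of the binary outcome X given a."""
    uniform = {+1: 0.5, -1: 0.5}
    return max(variational_distance(p, uniform) for p in conditionals.values())

uniform = {+1: 0.5, -1: 0.5}
improved = {+1: 0.75, -1: 0.25}  # a prediction that is right 75% of the time
print(variational_distance(improved, uniform))  # 0.25
```

The example reproduces the improvement of 0.25 discussed for the Stern-Gerlach illustration in the main text.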



¹ Note that the measurements we speak of in the Appendix have a slightly different form than those in the main text. Specifically, we now
assume that measurements behave ideally, projecting onto one of two basis elements and leading to one of the two outcomes ±1. In a real
experiment, there is always the additional possibility of no photon detection (let us denote this outcome 0). The measurements discussed
in the main text are configured to distinguish +1 from either -1 or 0, or to distinguish -1 from either +1 or 0. Both measurements
are used in the experiment to infer the distribution of the ideal measurement with outcomes ±1.

² A note on notation: we usually use lower case to denote particular instances of upper case random variables.
This is equivalent to Equation (6) in the main text. We remark that $I_N \ge 1$ is a Bell inequality, i.e. is satisfied by
any local hidden variable model.

Lemma 1. For any non-signalling probability distribution, $P_{XYZ|ABC}$, we have

$$D(P_{Z|abcx}, P_{Z|abc}) \le \frac{I_N}{2} + \nu_N \tag{12}$$

for all a, b, c, and x.

To connect this back to the main text, we remark that the Markov chain condition $X \leftrightarrow A \leftrightarrow \Xi$ is equivalent to
$P_{Z|abcx} = P_{Z|abc}$ (which corresponds to $\Xi$ not being of use to predict X). Hence, from the operational meaning of the
variational distance (given above), the left-hand-side of (12) corresponds to the maximum increase in the probability
of correctly predicting the outcome X, denoted $\delta$ in the main text.

The proof is an extension of an argument given in [16], which is based on chained Bell inequalities [19, 20] and
generalizes results of [21-23]. Many steps of this proof mirror those in [16], which we repeat for completeness.
Furthermore, note that the bound derived in this Lemma is tighter than that of [16].

Proof. We first consider the quantity $I_N$ evaluated for the conditional distribution $P_{XY|AB,cz} = P_{XY|ABCZ}(\cdot, \cdot | \cdot, \cdot, c, z)$,
for any fixed c and z. The idea is to use this quantity to bound the variational distance between the conditional
distribution $P_{X|acz}$ and its negation, $1 - P_{X|acz}$, which corresponds to the distribution of X if its values are interchanged.
If this distance is small, it follows that the distribution $P_{X|acz}$ is roughly uniform.
For $a_0 := 0$, $b_0 := 2N - 1$, we have

$$\begin{aligned}
I_N(P_{XY|AB,cz}) &= P(X = Y | A = a_0, B = b_0, C = c, Z = z) + \sum_{\substack{a, b \\ |a-b| = 1}} P(X \neq Y | A = a, B = b, C = c, Z = z) \\
&\ge D(1 - P_{X|a_0 b_0 cz}, P_{Y|a_0 b_0 cz}) + \sum_{\substack{a, b \\ |a-b| = 1}} D(P_{X|abcz}, P_{Y|abcz}) \\
&= D(1 - P_{X|a_0 cz}, P_{Y|b_0 cz}) + \sum_{\substack{a, b \\ |a-b| = 1}} D(P_{X|acz}, P_{Y|bcz}) \\
&\ge D(1 - P_{X|a_0 cz}, P_{X|a_0 cz}) \\
&= 2 D(P_{X|a_0 b_0 cz}, \bar{P}_X). 
\end{aligned} \tag{13}$$

The first inequality follows from the fact that $D(P_{X|\Omega}, P_{Y|\Omega}) \le P(X \neq Y | \Omega)$ for any event $\Omega$ (a short proof of this
can be found in [23]). Furthermore, we have used the non-signalling conditions $P_{X|abcz} = P_{X|acz}$ (from (9)) and
$P_{Y|abcz} = P_{Y|bcz}$ (from (10)), and the triangle inequality for D. By symmetry, this relation holds for all a and b. We
hence obtain $D(P_{X|abcz}, \bar{P}_X) \le \frac{1}{2} I_N(P_{XY|AB,cz})$ for all a, b, c and z.

We now take the average over z on both sides of (13). First, the left hand side gives

$$\begin{aligned}
\sum_z P_{Z|abc}(z)\, I_N(P_{XY|AB,cz}) &= \sum_z P_{Z|a_0 b_0 c}(z)\, P(X = Y | a_0, b_0, c, z) + \sum_{\substack{a, b \\ |a-b| = 1}} \sum_z P_{Z|abc}(z)\, P(X \neq Y | a, b, c, z) \\
&= P(X = Y | a_0, b_0, c) + \sum_{\substack{a, b \\ |a-b| = 1}} P(X \neq Y | a, b, c) \\
&= I_N(P_{XY|AB,c}),
\end{aligned} \tag{14}$$

where we used the non-signalling condition $P_{Z|abc} = P_{Z|c}$ (which is implied by (9) and (10)) several times. Next,
taking the average on the right hand side of (13) yields $\sum_z P_{Z|abc}(z) D(P_{X|abcz}, \bar{P}_X) = D(P_{XZ|abc}, \bar{P}_X \times P_{Z|abc})$, so
we have

$$2 D(P_{XZ|abc}, \bar{P}_X \times P_{Z|abc}) \le I_N(P_{XY|AB,c}) = I_N(P_{XY|AB}). \tag{15}$$

The last equality follows from the non-signalling condition (8) (if $P(X = Y | a, b, c)$ or $P(X \neq Y | a, b, c)$ depended on
c, then there would be signalling from C to A and B).






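Furthermore, note that

$$2 D(P_{XZ|abc}, \bar{P}_X \times P_{Z|abc}) = \sum_z \left| P_{XZ|abc}(-1, z) - \tfrac{1}{2} P_{Z|abc}(z) \right| + \sum_z \left| P_{XZ|abc}(+1, z) - \tfrac{1}{2} P_{Z|abc}(z) \right|$$

and that both of the terms on the right hand side are equal (since $P_{Z|abc}(z) = P_{XZ|abc}(-1, z) + P_{XZ|abc}(+1, z)$), i.e.
$\sum_z |P_{XZ|abc}(x, z) - \frac{1}{2} P_{Z|abc}(z)| \le \frac{1}{2} I_N(P_{XY|AB})$ for all a, b, c and x. Note also that $D(P_{X|a}, \bar{P}_X) = |P_{X|a}(x) - \frac{1}{2}|$ for all x.
Combining the above, we have

$$\begin{aligned}
D(P_{Z|abcx}, P_{Z|abc}) &= \sum_z \left| \tfrac{1}{2} P_{Z|abcx}(z) - \tfrac{1}{2} P_{Z|abc}(z) \right| \\
&\le \sum_z \left| \tfrac{1}{2} P_{Z|abcx}(z) - P_{X|abc}(x) P_{Z|abcx}(z) \right| + \sum_z \left| P_{X|abc}(x) P_{Z|abcx}(z) - \tfrac{1}{2} P_{Z|abc}(z) \right| \\
&= \sum_z P_{Z|abcx}(z) \left| \tfrac{1}{2} - P_{X|abc}(x) \right| + \sum_z \left| P_{XZ|abc}(x, z) - \tfrac{1}{2} P_{Z|abc}(z) \right| \\
&\le D(P_{X|a}, \bar{P}_X) + \tfrac{1}{2} I_N(P_{XY|AB}).
\end{aligned}$$

This establishes the relation (12), since $D(P_{X|a}, \bar{P}_X) \le \nu_N$ by the definition of the bias. □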



Tightness 

We can also establish that this bound is tight, as follows. Consider a classical model in which, with probability $\epsilon$,
we have X = Y = Z = -1 and otherwise X = Y = Z = +1 (independently of A, B and C). This distribution has
$I_N(P_{XY|AB}) = 1$ and $\nu = \frac{1}{2} - \epsilon$. It also satisfies $D(P_{Z|abc,X=-1}, P_{Z|abc}) = 1 - \epsilon$, which is equal to the bound implied
by (12).
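The tightness claim can be verified numerically. The following sketch evaluates both sides of (12) for the classical model described above, assuming $0 < \epsilon \le 1/2$ and conditioning on X = -1; the function name `tightness_check` is hypothetical.

```python
def tightness_check(eps):
    """Both sides of (12) for the model X = Y = Z = -1 with probability eps
    and X = Y = Z = +1 otherwise (assumes 0 < eps <= 1/2)."""
    p_z = {-1: eps, +1: 1.0 - eps}           # distribution of Z
    p_z_given_x_minus = {-1: 1.0, +1: 0.0}   # Z is fixed once X = -1 is known
    lhs = 0.5 * sum(abs(p_z_given_x_minus[z] - p_z[z]) for z in p_z)
    i_n = 1.0            # P(X = Y) = 1 for the first term, P(X != Y) = 0 otherwise
    nu = abs(eps - 0.5)  # distance of the distribution of X from uniform
    rhs = i_n / 2 + nu
    return lhs, rhs

for eps in (0.1, 0.25, 0.5):
    print(eps, tightness_check(eps))
```

For each value of $\epsilon$ in this range both sides equal $1 - \epsilon$, so the bound is saturated.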



Application to Leggett models 

In the Leggett model [4], one imagines that improved predictions about the outcomes for measurements on qubits
are available. More precisely, each particle has an associated vector (thought of as a hidden direction of its spin) and
the outcome distribution is expressed via the inner product with the vector describing the measurement (see Figure 1
in the main text). Denoting the hidden vector for the first particle by $\mathbf{z}$, and its measurement vector $\mathbf{a}$ (this is the
Bloch vector associated with the chosen measurement direction), its outcomes are distributed according to

$$P_{X|\mathbf{a}\mathbf{z}}(\pm 1) = \frac{1}{2}(1 \pm \mathbf{a} \cdot \mathbf{z}). \tag{16}$$

To relate this back to the discussion above, the Leggett model corresponds to the case that there is no C, and where
the hidden vectors are contained in Z. Note that Leggett already showed his model to be incompatible with quantum
theory [4] and experiments have since falsified it using specific inequalities [12], [13], [E]. Here we discuss the model
in light of our experiment, which, it turns out, is sufficient to falsify it.

Note that, as presented in [4] and above, the model is not fully specified, since the distribution of the hidden vectors,
$\mathbf{z}$, is not given. In order to discuss the implications of our experimental results, we refer to four cases (corresponding
to different distributions over $\mathbf{z}$). Before describing these cases, we first note that (12) implies

$$\langle D(P_{X|\mathbf{a}\mathbf{z}}, \bar{P}_X) \rangle_{\mathbf{z}} \le \delta_N, \tag{17}$$

for all $\mathbf{a}$, where $\langle \cdot \rangle_{\mathbf{z}}$ denotes the expectation value over the vectors $\mathbf{z}$. In order to falsify a particular version of the
Leggett model, we compute $\delta_N^{crit}$, the smallest increase in predictive power under the assumption that a particular
version of the Leggett model is correct (i.e. the smallest value of the left-hand-side of (17) over all $\mathbf{a}$). We then
show that $\delta_N^{crit}$ is above the maximum increase in predictive power compatible with the experimental data, $\delta_N$, hence
falsifying that version of Leggett's model.

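First Case: We imagine that the vector $\mathbf{z}$ is a fixed vector (i.e. $P_Z(\mathbf{z}) = 1$) in the same plane on the Bloch sphere
as our measurements. From (17) we find $D(P_{X|\mathbf{a}\mathbf{z}}, \bar{P}_X) \le \delta_N$ for all $\mathbf{a}$. However, (16) implies $D(P_{X|\mathbf{a}\mathbf{z}}, \bar{P}_X) = \frac{1}{2}|\mathbf{a} \cdot \mathbf{z}|$.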
 N                 $\delta_N^{crit1}$   $\delta_N^{crit2}$   $\delta_N^{crit3}$   $\delta_N^{crit4}$   $\delta_N^1$          $\delta_N^2$
 2                 0.3536    0.25      0.25      0.1768    0.3125 ± 0.0025
 3                 0.4330    0.3062    0.25      0.2165    0.2437 ± 0.0023
 4                 0.4619    0.3266    0.25      0.2310    0.2094 ± 0.0023
 5                 0.4755    0.3362    0.25      0.2378    0.2015 ± 0.0023
 6                 0.4830    0.3415    0.25      0.2415    0.1942 ± 0.0021    0.2297 ± 0.0020
 7                 0.4875    0.3447    0.25      0.2437    0.1984 ± 0.0021
 min. visibility   0.821     0.906     0.946     0.951

TABLE II: Leggett models: critical values and experimental data. This table shows the critical values of $\delta_N$ required
to rule out each of the four Leggett-type models discussed in the text. Also shown are measured values for $\delta_N^1$ and $\delta_N^2$, where
the superscript refers to measurements in the $|+\rangle$-$|L\rangle$ plane, and the $|+\rangle$-$|H\rangle$ plane of the Bloch sphere, respectively. Bold
values have $\delta_N^1 < \delta_N^{crit}$ and, if required, $\delta_N^2 < \delta_N^{crit}$, i.e. they are ruled out by the data for that N. Note that the measurement
in the second, orthogonal plane was only done for N = 6. This value is relevant for ruling out the second and fourth cases. In
the last row of the table, we note the minimum visibility required to rule out each of the four models.



In order to make $\max_{\mathbf{a}} |\mathbf{a} \cdot \mathbf{z}|$ as small as possible, i.e. find $\delta_N^{crit}$, we require the vectors $\mathbf{z}$ to be as far as possible from
any of the possible $\mathbf{a}$ vectors. If the fixed vector $\mathbf{z}$ is in the plane containing the measurements, this condition leads
to $\max_{\mathbf{a}} |\mathbf{a} \cdot \mathbf{z}| = \cos\frac{\pi}{2N}$ (i.e. $\mathbf{z}$ is positioned exactly in between two settings for $\mathbf{a}$). Hence, this specific version of the
Leggett model is falsified if the measured $\delta_N < \delta_N^{crit1} = \frac{1}{2}\cos\frac{\pi}{2N}$. As shown in Table II, this is the case for all values
of N assessed.

According to quantum theory, appropriately chosen measurements on a maximally entangled state lead to
correlations for which $\delta_N = \frac{N}{2}(1 - \cos\frac{\pi}{2N})$. However, no experimental realization can be noise-free, and this affects the
minimum $\delta_N$ attainable (see [11, 31]). One way to characterize the imperfection in the experiment is via the visibility.
In an experiment with visibility V we instead obtain $\delta_N = \frac{N}{2}(1 - V\cos\frac{\pi}{2N})$, which for fixed V has a minimum at
finite N. In the case of this model, the minimum visibility required to falsify it is 0.821 (with such a visibility the
model could be ruled out with N = 3).
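The dependence of the quantum prediction on N and on the visibility can be explored with a short numerical sketch. It assumes the expression $\delta_N = \frac{N}{2}(1 - V\cos\frac{\pi}{2N})$, with V = 1 describing the noise-free case; the function names and the search range `n_max` are hypothetical choices made for illustration.

```python
import math

def delta_quantum(n_bases, visibility=1.0):
    """Quantum prediction for delta_N at a given visibility V (V = 1: ideal)."""
    return 0.5 * n_bases * (1.0 - visibility * math.cos(math.pi / (2 * n_bases)))

def best_n(visibility, n_max=50):
    """N in 2..n_max that minimizes delta_N for a fixed visibility."""
    return min(range(2, n_max + 1), key=lambda n: delta_quantum(n, visibility))

print(delta_quantum(6))   # ideal value for N = 6, approximately 0.102
print(best_n(0.98))       # for V < 1 the optimum lies at finite N
```

For V = 1 the value decreases monotonically with N, whereas for any V < 1 the term proportional to N(1 - V) eventually dominates, so that increasing N beyond the optimum worsens the bound.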

Second case: We now suppose $\mathbf{z}$ is a fixed vector, not confined to the plane. Then our basic measurements cannot
strictly rule out this model: in principle, $\mathbf{z}$ could be close to orthogonal to the plane containing the measurement
vectors. (We remark that if $\mathbf{z}$ is completely orthogonal to this plane, then it would not be useful for making predictions.)
However, in order to rectify this we can include a second set of measurements in the set of random choices. This set
should be identical to the first apart from being contained in an orthogonal plane. We denote the sets $A_1$ and $A_2$ and
we separately measure the $\delta_N$ values for each plane, generating values denoted $\delta_N^1$ and $\delta_N^2$. Analogously to the first
case discussed above, this version of the Leggett model is falsified unless for all $\mathbf{a} \in A_1 \cup A_2$, $|\mathbf{a} \cdot \mathbf{z}|/2 \le \min(\delta_N^1, \delta_N^2)$. In
order to make $\max_{\mathbf{a}} |\mathbf{a} \cdot \mathbf{z}|$ as small as possible, we require the vectors $\mathbf{z}$ to be as far as possible from any of the possible
$\mathbf{a}$ vectors. Consider now the four vectors $(0, \sin\phi, \cos\phi)$, $(0, -\sin\phi, \cos\phi)$, $(\cos\phi, \sin\phi, 0)$ and $(\cos\phi, -\sin\phi, 0)$ for
$\phi = \frac{\pi}{2N}$ (these represent two neighbouring pairs of measurement vectors (one in each plane), where we have chosen the
coordinates such that they are symmetric). The vector equidistant from these (in their convex hull) is $(\frac{1}{\sqrt{2}}, 0, \frac{1}{\sqrt{2}})$.
It is then not possible that for all $\mathbf{a} \in A_1 \cup A_2$, $|\mathbf{a} \cdot \mathbf{z}|/2 \le \min(\delta_N^1, \delta_N^2)$ provided $\max(\delta_N^1, \delta_N^2) < \delta_N^{crit2} = \frac{1}{2\sqrt{2}}\cos\frac{\pi}{2N}$.
As shown in Table II, our experiment, which includes a measurement of $\delta_N^2$ in an orthogonal plane, also rules out this
version of the Leggett model. (The minimum visibility required to rule out this model is 0.906, which could do so
using N = 4.)
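The geometry used in the second case can be checked numerically. The sketch below assumes the four neighbouring measurement vectors have half-angle $\phi = \pi/(2N)$ and that the candidate hidden vector is $(1/\sqrt{2}, 0, 1/\sqrt{2})$; it confirms that this vector has the same overlap with all four and evaluates the resulting critical value. The function name `second_case_check` is hypothetical.

```python
import math

def second_case_check(n_bases):
    """Overlaps |a.z| of z = (1, 0, 1)/sqrt(2) with the four neighbouring
    measurement vectors, and the implied critical value cos(phi)/(2*sqrt(2))."""
    phi = math.pi / (2 * n_bases)
    neighbours = [
        (0.0, math.sin(phi), math.cos(phi)),
        (0.0, -math.sin(phi), math.cos(phi)),
        (math.cos(phi), math.sin(phi), 0.0),
        (math.cos(phi), -math.sin(phi), 0.0),
    ]
    z = (1 / math.sqrt(2), 0.0, 1 / math.sqrt(2))
    overlaps = [abs(sum(ai * zi for ai, zi in zip(a, z))) for a in neighbours]
    critical = math.cos(phi) / (2 * math.sqrt(2))
    return overlaps, critical

overlaps, critical = second_case_check(3)
print(overlaps, critical)
```

All four overlaps coincide, and half of their common value gives the critical value for the chosen N.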

Third case: We consider a slightly modified model in which $\mathbf{z}$ is distributed uniformly over the Bloch sphere. This
model is arguably more natural since it is somewhat conspiratorial for $\mathbf{z}$ to always take a particular orientation with
respect to the measurements we perform (particularly if that measurement is chosen freely), and is the one referred
to in the main text. In this case, defining $\theta$ as the angle between $\mathbf{a}$ and $\mathbf{z}$, we compute the left hand side of (17) as

$$\langle D(P_{X|\mathbf{a}\mathbf{z}}, \bar{P}_X) \rangle_{\mathbf{z}} = \frac{1}{2} \int_0^{\pi} \frac{|\cos\theta| \sin\theta}{2}\, d\theta = \frac{1}{4}.$$

This model is hence excluded if one finds $\delta_N < \delta_N^{crit3} = \frac{1}{4}$ (measurements are needed only in one plane). As shown in
Table II, this is the case for $N \ge 3$. (The minimum visibility required to rule out this model is 0.946, which could do
so for N = 5.)
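The average over the sphere can be confirmed by direct numerical integration. The sketch below assumes that the distance for a given angle is $\frac{1}{2}|\cos\theta|$ and that a uniform distribution over the Bloch sphere corresponds to the weight $\frac{1}{2}\sin\theta$ in the polar angle; the function name and the midpoint rule are illustrative choices.

```python
import math

def leggett_uniform_average(steps=100_000):
    """Midpoint-rule estimate of the sphere average of (1/2)|cos(theta)|,
    using the polar-angle weight (1/2) sin(theta) on [0, pi]."""
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = (k + 0.5) * h
        distance = 0.5 * abs(math.cos(theta))  # distance from uniform at angle theta
        weight = 0.5 * math.sin(theta)         # uniform measure on the sphere
        total += distance * weight * h
    return total

print(leggett_uniform_average())
```

The estimate agrees with the analytic value of 1/4 to within the discretization error.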

Fourth case: Here we return to our measurements in two orthogonal planes and ask whether our data is sufficient
to falsify the model for any distribution over $z$. (We can think of this in terms of an adversarial picture. Suppose the
set of possible measurement choices is known to an adversary, who can pick the vector $z$ according to any distribution
he likes. The aim is to show that our measurement results are not consistent with any such adversary.) For this model
to be correct we need

$$\langle |\alpha \cdot z| \rangle_z / 2 < \delta_N^1 \quad \text{for all } \alpha \in A_1,$$

$$\langle |\alpha \cdot z| \rangle_z / 2 < \delta_N^2 \quad \text{for all } \alpha \in A_2.$$

Again we can parameterize in terms of the four vectors introduced previously. When minimizing with respect to these
four, we should take $P_z$ to have support only on the set $(\sin\theta, 0, \cos\theta)$ (going off this line increases the inner product
with measurement vectors in both sets). We thus have

$$\langle |\alpha \cdot z| \rangle_z = \begin{cases} \int d\theta \, p(\theta) \cos\theta \cos\frac{\pi}{2N} & \text{for all } \alpha \in A_1 \\ \int d\theta \, p(\theta) \sin\theta \cos\frac{\pi}{2N} & \text{for all } \alpha \in A_2 \end{cases}$$

where $p(\theta)$ is the probability density over $\theta$.

In other words, non-zero $p(\theta)$ gives contribution $\cos\theta \cos\frac{\pi}{2N}$ to the first integral, and $\sin\theta \cos\frac{\pi}{2N}$ to the second.
In order that both integrals are equal, we should take $p(\theta)$ to be symmetric about $\theta = \frac{\pi}{4}$. For functions with
this symmetry, non-zero $p(\theta)$ gives contribution $(\sin\theta + \cos\theta) \cos\frac{\pi}{2N}$ to both integrals. The minimum of this over
$0 \le \theta \le \frac{\pi}{4}$ is $\cos\frac{\pi}{2N}$, which occurs for $\theta = 0$. It follows that the most experimentally challenging distribution to rule
out is $p(\theta) = \frac{1}{2}(\delta_{\theta,0} + \delta_{\theta,\frac{\pi}{2}})$, where $\delta_{x,y}$ is the Kronecker delta (this being the distribution that requires the lowest
measured $\delta_N$ to eliminate). For this distribution, we have $\max_\alpha \langle |\alpha \cdot z| \rangle_z / 2 = \frac{1}{4} \cos\frac{\pi}{2N}$, so this model is ruled out for
$\max(\delta_N^1, \delta_N^2) < \delta_N^{\mathrm{crit}} = \frac{1}{4} \cos\frac{\pi}{2N}$. Again, as detailed in Table II, our experimental data is sufficient to do so. (The
lowest visibility that could rule out this case is 0.951, which would do so for $N = 5$.)
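
The choice of the two-point distribution in the fourth case can be illustrated numerically. The sketch below is only a partial check, not a proof: it restricts the adversary to symmetric two-point distributions $p = \frac{1}{2}(\delta_{\theta,\theta_0} + \delta_{\theta,\pi/2-\theta_0})$ on the line $(\sin\theta, 0, \cos\theta)$, assumes overlaps $\cos\theta\cos\frac{\pi}{2N}$ and $\sin\theta\cos\frac{\pi}{2N}$ with the two planes, and scans $\theta_0$ over $[0, \pi/4]$. The helper name `max_half_overlap` and the grid resolution are hypothetical choices of ours.

```python
import math

def max_half_overlap(theta0, N):
    """Largest of <|alpha.z|>_z / 2 over the two planes for the symmetric
    two-point distribution supported on theta0 and pi/2 - theta0."""
    c = math.cos(math.pi / (2 * N))
    points = [theta0, math.pi / 2 - theta0]
    plane1 = sum(math.cos(t) * c for t in points) / len(points)  # alpha in A_1
    plane2 = sum(math.sin(t) * c for t in points) / len(points)  # alpha in A_2
    return max(plane1, plane2) / 2

N = 5
grid = [k * (math.pi / 4) / 1000 for k in range(1001)]
values = [max_half_overlap(t, N) for t in grid]
best = min(range(len(grid)), key=values.__getitem__)
print(grid[best], values[best], math.cos(math.pi / (2 * N)) / 4)
```

Within this restricted family the smallest value is attained at $\theta_0 = 0$ and equals $\frac{1}{4}\cos\frac{\pi}{2N}$.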



Comment on minimum visibilities required to rule out Leggett models 

Here we briefly compare the visibilities required to rule out Leggett models using our approach with those needed in 
previously considered Leggett inequalities. We remind the reader that the technique used in the present work generates 
conclusions that apply to arbitrary theories and were not developed with Leggett's model in mind. Nevertheless, use 
of this new approach to rule out Leggett models requires comparable visibilities to those of previously discussed 
inequalities. More specifically, the claimed minimum visibilities are 0.974 in Groblacher et al. [T^] and 0.943 for the 
alternative inequality of Branciard et al. [13, 15], which is only slightly below the value we require to rule out all of
the four models above. 
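
The minimum visibilities quoted for the second, third and fourth cases can be reproduced with a short calculation. This sketch rests on two assumptions that are ours, not statements from the text: (i) a noise model in which a state of visibility $V$ with no bias gives $\delta_N(V) = \frac{N}{2}\left(1 - V\cos\frac{\pi}{2N}\right)$, and (ii) critical values $\frac{1}{2\sqrt{2}}\cos\frac{\pi}{2N}$, $\frac{1}{4}$ and $\frac{1}{4}\cos\frac{\pi}{2N}$ for the three cases. The function name `min_visibility` and the search range are hypothetical.

```python
import math

def min_visibility(delta_crit, n_max=20):
    """Smallest V (and the N achieving it) with delta_N(V) = delta_crit(N),
    under the assumed model delta_N(V) = (N / 2) * (1 - V * cos(pi / (2N)))."""
    best = None
    for N in range(2, n_max + 1):
        c = math.cos(math.pi / (2 * N))
        V = (1 - 2 * delta_crit(N) / N) / c
        if best is None or V < best[0]:
            best = (V, N)
    return best

# Assumed critical values for the second, third and fourth cases.
cases = {
    "second": lambda N: math.cos(math.pi / (2 * N)) / (2 * math.sqrt(2)),
    "third": lambda N: 0.25,
    "fourth": lambda N: math.cos(math.pi / (2 * N)) / 4,
}
thresholds = {name: min_visibility(f) for name, f in cases.items()}
for name, (V, N) in thresholds.items():
    print(f"{name}: V > {V:.3f} at N = {N}")
```

Under these assumptions the sketch returns 0.906 at $N = 4$, 0.946 at $N = 5$ and 0.951 at $N = 5$, matching the values quoted in the text.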

We note that the visibility for measurements in the plane used in the main text was 0.967 ± 0.007, while the 
visibility in the orthogonal plane (measured for the purposes of ruling out the second and fourth cases) was 0.977 ± 
0.009. 




FIG. 4: Measurement settings for $N = 6$. All settings lie in one plane of the Bloch sphere. Alice's settings are
indicated in red and Bob's in blue.
Density Matrix and Raw Data



The experimental settings as well as the associated measurement results that allow reconstruction of the density
matrix are given in Table III. The most likely density matrix is detailed in Table IV. Note that this density matrix is
not used for the calculation of experimental values for $\delta_N$, $I_N$ or $\nu_N$ but is included to characterize our source. The
measurement settings used to experimentally determine $\delta_6$ are depicted in Figure 4, and Table V lists the results
used to calculate $\delta_6$ from the bi-partite correlation $I_6$ and the bias $\nu_6$.



  Setting      HWPa     QWPa     HWPb     QWPb     Rc       ΔRc
  a     b      (°)      (°)      (°)      (°)      (cps)    (cps)
 |H>   |H>      0        0        0        0       240.0    2.8
 |H>   |V>      0        0       45        0         1.8    0.2
 |H>   |+>      0        0       22.5     45       118.4    2.0
 |H>   |->      0        0      -22.5     45       125.3    2.0
 |H>   |R>      0        0        0       45       118.7    2.0
 |H>   |L>      0        0        0      -45       130.6    2.0
 |V>   |H>     45        0        0        0         2.0    0.3
 |V>   |V>     45        0       45        0       230.6    2.8
 |V>   |+>     45        0       22.5     45       119.1    2.0
 |V>   |->     45        0      -22.5     45       118.5    2.0
 |V>   |R>     45        0        0       45       123.8    2.0
 |V>   |L>     45        0        0      -45       112.4    1.9
 |+>   |H>     22.5     45        0        0       115.7    2.0
 |+>   |V>     22.5     45       45        0       122.0    2.0
 |+>   |+>     22.5     45       22.5     45       245.8    2.9
 |+>   |->     22.5     45      -22.5     45         3.6    0.3
 |+>   |R>     22.5     45        0       45       118.8    2.0
 |+>   |L>     22.5     45        0      -45       111.3    1.9
 |->   |H>    -22.5     45        0        0       124.2    2.0
 |->   |V>    -22.5     45       45        0       116.6    2.0
 |->   |+>    -22.5     45       22.5     45         3.7    0.4
 |->   |->    -22.5     45      -22.5     45       241.1    2.8
 |->   |R>    -22.5     45        0       45       124.5    2.0
 |->   |L>    -22.5     45        0      -45       138.5    2.1
 |R>   |H>      0       45        0        0       112.7    1.9
 |R>   |V>      0       45       45        0       124.5    2.0
 |R>   |+>      0       45       22.5     45       119.2    2.0
 |R>   |->      0       45      -22.5     45       121.7    2.0
 |R>   |R>      0       45        0       45         2.9    0.3
 |R>   |L>      0       45        0      -45       239.3    2.8
 |L>   |H>      0      -45        0        0       128.8    2.1
 |L>   |V>      0      -45       45        0       109.0    1.9
 |L>   |+>      0      -45       22.5     45       115.8    2.0
 |L>   |->      0      -45      -22.5     45       124.6    2.0
 |L>   |R>      0      -45        0       45       230.8    2.8
 |L>   |L>      0      -45        0      -45         4.0    0.4

TABLE III: Tomographic Data. This table shows raw data collected to find the density matrix shown in Table IV. The
coincidence rates between the Si avalanche photodiode (APD) and the triggered 1550 nm InGaAs APD (Rc) for each set of
photon analyzer settings are given in average counts per second (cps), as are their one standard deviation uncertainties (ΔRc).
Settings a and b were implemented using one quarter wave plate followed by one half wave plate in each analyzer. These
waveplates were set at angles HWPa, QWPa, HWPb, and QWPb. Data collection time for each point was 30 seconds.
(a) $\rho_{\mathrm{Re}}$

           <HH|       <HV|       <VH|       <VV|
 |HH>      0.5038    -0.0052    -0.0092     0.4851
 |HV>     -0.0052     0.0040     0.0001    -0.0011
 |VH>     -0.0092     0.0001     0.0043    -0.0044
 |VV>      0.4851    -0.0011    -0.0044     0.4879

(b) $\rho_{\mathrm{Im}}$

           <HH|       <HV|       <VH|       <VV|
 |HH>      0.0000     0.0155     0.0138    -0.0140
 |HV>     -0.0155     0.0000     0.0017    -0.0156
 |VH>     -0.0138    -0.0017     0.0000    -0.0113
 |VV>      0.0140     0.0156     0.0113     0.0000

TABLE IV: Density matrix. The real and imaginary parts of the density matrix generated by maximum likelihood quantum
state tomography.
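
As an illustration of how the matrix in Table IV characterizes the source, the sketch below checks that it has unit trace and is Hermitian, and evaluates its purity and its fidelity with a maximally entangled state. The choice of $|\Phi^+\rangle = (|HH\rangle + |VV\rangle)/\sqrt{2}$ as the target state and the use of fidelity and purity as figures of merit are our assumptions; neither quantity is quoted in the text.

```python
# Real and imaginary parts from Table IV, basis order HH, HV, VH, VV.
RE = [
    [0.5038, -0.0052, -0.0092, 0.4851],
    [-0.0052, 0.0040, 0.0001, -0.0011],
    [-0.0092, 0.0001, 0.0043, -0.0044],
    [0.4851, -0.0011, -0.0044, 0.4879],
]
IM = [
    [0.0000, 0.0155, 0.0138, -0.0140],
    [-0.0155, 0.0000, 0.0017, -0.0156],
    [-0.0138, -0.0017, 0.0000, -0.0113],
    [0.0140, 0.0156, 0.0113, 0.0000],
]
rho = [[complex(RE[i][j], IM[i][j]) for j in range(4)] for i in range(4)]

trace = sum(rho[i][i] for i in range(4)).real
hermitian = all(
    abs(rho[i][j] - rho[j][i].conjugate()) < 1e-12
    for i in range(4) for j in range(4)
)
# For a Hermitian matrix, Tr(rho^2) equals the sum of |rho_ij|^2.
purity = sum(abs(rho[i][j]) ** 2 for i in range(4) for j in range(4))

# Assumed target state: |Phi+> = (|HH> + |VV>) / sqrt(2).
phi_plus = [2 ** -0.5, 0.0, 0.0, 2 ** -0.5]
fidelity = sum(
    phi_plus[i] * rho[i][j] * phi_plus[j] for i in range(4) for j in range(4)
).real

print(trace, hermitian, purity, fidelity)
```

With the tabulated entries the fidelity evaluates to $(0.5038 + 0.4879 + 2 \times 0.4851)/2 \approx 0.981$.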
 Setting     HWPa     HWPb      R_Si     Rc       1-P(m,n)   P(m,n)   ΔP(m,n)   ν        Δν
  m    n     (°)      (°)       (cps)    (cps)

  0   11      0       -41.25    18692      6.7    0.0259     0.9741   0.0011    0.0047   0.0003
  0   23      0       -86.25    18597    264.9
 12   23    -45       -86.25    19017      7.5
 12   11    -45       -41.25    18976    270.6

  0    1      0        -3.75    18552    267.3    0.9657     0.0343   0.0012    0.0045   0.0003
  0   13      0       -48.75    18588      9.2
 12   13    -45       -48.75    18919    263.0
 12    1    -45        -3.75    18900      9.6

  2    1     -7.5      -3.75    18571    267.8    0.9697     0.0303   0.0012    0.0039   0.0003
  2   13     -7.5     -48.75    18632      7.6
 14   13    -52.5     -48.75    18772    263.3
 14    1    -52.5      -3.75    19018      9.0

  2    3     -7.5     -11.25    18528    266.8    0.9665     0.0335   0.0012    0.0042   0.0003
  2   15     -7.5     -56.25    18746      8.5
 14   15    -52.5     -56.25    18910    268.4
 14    3    -52.5     -11.25    18990     10.1

  4    3    -15       -11.25    18712    271.2    0.9720     0.0280   0.0011    0.0029   0.0003
  4   15    -15       -56.25    18604      7.9
 16   15    -60       -56.25    18430    263.6
 16    3    -60       -11.25    18449      7.6

  4    5    -15       -18.75    18058    262.4    0.9671     0.0329   0.0012    0.0022   0.0003
  4   17    -15       -63.75    17979      7.8
 16   17    -60       -63.75    18147    254.3
 16    5    -60       -18.75    18201      9.8

  6    5    -22.5     -18.75    18034    262.0    0.9716     0.0284   0.0011    0.0003   0.0003
  6   17    -22.5     -63.75    18129      7.8
 18   17    -67.5     -63.75    18045    261.9
 18    5    -67.5     -18.75    18166      7.5

  6    7    -22.5     -26.25    18044    259.2    0.9634     0.0366   0.0013    0.0017   0.0003
  6   19    -22.5     -71.25    18499      8.5
 18   19    -67.5     -71.25    18438    257.9
 18    7    -67.5     -26.25    18360     11.1

  8    7    -30       -26.25    18354    261.9    0.9735     0.0265   0.0011    0.0002   0.0003
  8   19    -30       -71.25    18386      7.3
 20   19    -75       -71.25    18317    262.8
 20    7    -75       -26.25    18403      7.0

  8    9    -30       -33.75    18305    261.7    0.9659     0.0341   0.0012    0.0020   0.0003
  8   21    -30       -78.75    18254      8.5
 20   21    -75       -78.75    18066    256.7
 20    9    -75       -33.75    18200      9.8

 10    9    -37.5     -33.75    18042    260.4    0.9696     0.0304   0.0012    0.0002   0.0003
 10   21    -37.5     -78.75    18102      7.3
 22   21    -82.5     -78.75    18100    257.1
 22    9    -82.5     -33.75    18073      9.0

 10   11    -37.5     -41.25    17979    256.3    0.9615     0.0385   0.0013    0.0005   0.0003
 10   23    -37.5     -86.25    17958      9.6
 22   23    -82.5     -86.25    17857    256.3
 22   11    -82.5     -41.25    18014     10.9

TABLE V: Raw Data used to calculate $\delta_6$. This table shows raw data collected to find $\delta_6 = 0.1942 \pm 0.0021$. HWPa
and HWPb are the half wave-plate settings that, together with quarter waveplate settings of -45° on side A and +45° on
side B, realize the measurements corresponding to m and n as shown in Figure 4. The free running Silicon APD rates (R_Si)
and the coincidence rates between the Si APD and the triggered 1550 nm InGaAs APD (Rc) are both given in average counts
per second. P(m, n) is the probability of correlated outcomes and ν is the bias for individual measurements as detailed above.
Data collection time for each point was 40 seconds. Uncertainties are one standard deviation.
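
The way the tabulated quantities combine can be sketched as follows. This is an illustration under assumptions of ours, not the authors' analysis code: we assume the combination $\delta_N = I_N/2 + \nu_N$, with $I_N$ the sum of the probability of equal outcomes for the pair (0, 11) and the probabilities of differing outcomes for the eleven neighbouring pairs, and $\nu_N$ the largest tabulated bias. We also assume that the four rates of a block split into two "same outcome" and two "different outcome" settings as indicated in the comments. The helper names are hypothetical.

```python
def correlated_fraction(same_rates, diff_rates):
    """Fraction of coincidences with equal outcomes within one block of four rows."""
    return sum(same_rates) / (sum(same_rates) + sum(diff_rates))

def bias(rates_outcome_0, rates_outcome_1):
    """|P(outcome 0) - 1/2| estimated from free-running Si APD rates."""
    total = sum(rates_outcome_0) + sum(rates_outcome_1)
    return abs(sum(rates_outcome_0) / total - 0.5)

# Block (m, n) = (0, 1): rows (0,1) and (12,13) are taken as "same", (0,13) and (12,1) as "different".
p_01 = correlated_fraction([267.3, 263.0], [9.2, 9.6])
nu_01 = bias([18552, 18588], [18919, 18900])

# Per-block terms read off the table (assumed to be the ones entering I_6).
first_term = 0.0259
neighbour_terms = [0.0343, 0.0303, 0.0335, 0.0280, 0.0329, 0.0284,
                   0.0366, 0.0265, 0.0341, 0.0304, 0.0385]
biases = [0.0047, 0.0045, 0.0039, 0.0042, 0.0029, 0.0022,
          0.0003, 0.0017, 0.0002, 0.0020, 0.0002, 0.0005]

I_6 = first_term + sum(neighbour_terms)
delta_6 = I_6 / 2 + max(biases)  # assumed combination
print(p_01, nu_01, I_6, delta_6)
```

With the rounded table entries this gives $I_6 \approx 0.379$ and $\delta_6 \approx 0.194$, consistent with the quoted $\delta_6 = 0.1942 \pm 0.0021$ up to rounding.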